Abstract — Radio Frequency Identification (RFID) systems are
gaining momentum in various applications of logistics, inventory, 
etc. A generic problem in such systems is to ensure that the 
RFID readers can reliably read a set of RFID tags, such that the 
probability of missing tags stays below an acceptable value. A tag 
may be missing (left unread) due to errors in the communication 
link towards the reader e.g. due to obstacles in the radio path. 
The present paper proposes techniques that use multiple reader 
sessions, during which the system of readers obtains a running 
estimate of the probability to have at least one tag missing. 
Based on such an estimate, it is decided whether an additional 
reader session is required. Two methods are proposed; both
rely on the statistical independence of the tag reading errors
across different reader sessions, which is a plausible assumption
when, e.g., each reader session is executed on a different reader.
The first method uses statistical relationships that are valid 
when the reader sessions are independent. The second method is 
obtained by modifying an existing capture-recapture estimator. 
The results show that, when the reader sessions are independent, 
the proposed mechanisms provide a good approximation to the 
probability of missing tags, such that the number of reader
sessions made meets the target specification. If the assumption of
independence is violated, the estimators are still useful, but they 
should be corrected by a margin of additional reader sessions to 
ensure that the target probability of missing tags is met. 

Index Terms — Missing tag problem, set cardinality estimation, 
error probability estimation, RFID networks 

I. Introduction 

RFID technology features a growing set of applications 
for identification of various objects. The applications span 
from simply identifying objects, serving as more informa- 
tive barcodes, gathering of sensory data and holding pri- 
vate/confidential information HI dill. The advantages of 
RFID technology include the low cost per tag and the low 
energy consumption, which lets them have a very long lifetime 
[1]. Passive RFID tags are a category of tags that have no
power supply of their own; they rely solely on the signal
sent from a reader to power their circuitry and to respond by
backscattering the signal [4].

The communication paradigm in the passive RFID systems 
is based on request/response: In the first step, the reader sends 
an interrogation signal to the tags within its range. In the 
second step the tags send their response to the reader by 
backscattering the signal. If multiple tags simultaneously reply 
to the reader, then the reader experiences tag collision. Hence, 
the reader should run a certain anti-collision protocol (also 
called collision-resolution or arbitration protocol) in order 
to successfully resolve each tag in its proximity. There are
various anti-collision protocols, which are in general divided
into two groups, ALOHA-based HIS and tree-based QlSl. 

Regardless of the actual arbitration protocol used, after a 
single run of the protocol is terminated, the reader has the 
identities of the tags in its proximity. In the ideal case, when 
there are no transmission errors and the only error experienced 
at the reader is due to the tag collisions, then one can be 
certain that all the tag identities have been collected during 
the arbitration process. However, errors do occur if either the 
query from a reader is not received correctly at a tag or the tag 
reply is not received at the reader. In principle, if a tag is in
a blind spot [9], then the communication between the tag and
the reader is always in error. The probability that a tag is in a
blind spot can be substantial and is primarily determined by
the physical disposition of the tag, but also by the material to
which the tag is affixed. In [10] it is shown that, if a tag is
attached to solar cream, the probability of not resolving the tag is
30%, and with mineral water it is 67%. The error probability
can thus vary considerably, increasing the probability of missing one or
more tags completely. In summary, if during the arbitration
protocol the link between a reader and a tag is in error, then 
this tag is not identified at the end of the protocol run. This 
is defined as the missing tag problem. 

There are multiple approaches to minimize the probability
of missing a tag. One method for determining group
completeness in an RFID network is based on each
tag storing one or more references to surrounding tags. The
resolved tags and the references are compared, and if not all
references are resolved, the reading/comparison is repeated.
Thereby the reader knows with high probability if tags are 
missing. This method targets rather static constellations
of tags, e.g. goods on pallets. Another approach is presented
in [9], where a set of RFID tags is resolved by using two
independent samples, in this case a database and RFID readings.
These two samples are used as in a classical capture-recapture
model [12] to derive estimators for the tag set cardinality.

The paper is organized as follows. An overview of the
problem and an intuitive explanation for the case of two reader
sessions are presented in the next section, followed by the
system model in Section III. The estimators for two reader
sessions are derived in Section IV and generalized to multiple
reader sessions in Section V. The estimators for two reader
sessions are evaluated analytically in Section VII. In Section VIII,
simulations show the performance of the proposed estimators in
scenarios with both dependent and independent reader sessions.
The work is concluded in Section IX.

II. Problem Definition 

The main idea in this paper is to use several independent 
readings of the tag set that consists of N tags. One reading 
of the tag set consists of one run of the arbitration protocol, 
and is denoted a reader session. In each reader session the 
probability that a given tag is not read is p. Reader sessions are
independent, when the probability that each tag is read in one 
reader session is independent of it being read in another. The 
value of the error probability p and tag cardinality N are not 
known a priori. At this point it is natural to ask: How can we 
assure, or at least attempt, to make the readings independent? 
Here are two plausible examples: 

1) Before the next reader session with the same reader, 
the tagged items are physically displaced/shuffled and it 
can be assumed that such an action "resets" the physical 
links and generates error with probability p. 

2) If one reader with multiple antennas or multiple readers 
are located at different positions, but remain in commu- 
nication range with the same tags, the reader sessions 
may be assumed independent. 

A scenario that encompasses both cases is the one with a con- 
veyor belt, along which several readers are deployed. It should 
be noted that these are ways to aim for independence and 
simulations show how the methods introduced underperform 
when the independence assumption does not hold. 

The basic idea of our approach builds on recent
ideas about cooperative readers [13] that can jointly infer
statistical information about the set of tags S in range. In
order to illustrate the idea, consider the case with two readers
each having a reader session, $r_1$ and $r_2$ respectively. The
probability that a tag is not read in a reader session is p. After
the two reader sessions are terminated, the readers exchange
information about the tag identities they have gathered. Let $k_1$
denote the subset of tags that have been read in both reader
sessions $r_1$ and $r_2$. Let $k_{2a}$ ($k_{2b}$) be the subset of tags that are
read only in $r_1$ ($r_2$). This is schematically represented in Fig. 1.
There is also a set of tags that are not read in either of the
reader sessions. Let $\hat{p}$ and $\hat{N}$ denote the estimates of p and
N, respectively. Based on the expected values for $k_1$, $k_{2a}$, and
$k_{2b}$, one can write:

$$\begin{aligned}
k_1 &= \hat{N}(1-\hat{p})^2 \\
k_{2a} + k_{2b} &= 2\hat{N}(1-\hat{p})\hat{p}
\end{aligned} \qquad (1)$$

Using these two equations, we can obtain values for $\hat{p}$ and
$\hat{N}$. Based on that, we can estimate the expected value of the
number of missing tags, $k_3$. Furthermore, we can estimate
the probability of having at least one tag missing and, if 
this probability is above a threshold value, we can perform 
additional readings. This process is generalized by devising 
methods to obtain p and N from three or more independent 
readings. The objective is to create a sequential decision 

Fig. 1. Venn diagram of the possible tag sets for two reader sessions $r_1$
and $r_2$. $k_1$ is the number of tags found in common in both reader sessions,
$k_{2a}$ is the number of tags found only in reader session $r_1$, and $k_{2b}$ only in reader
session $r_2$. The set $k_2$ is given by the sum of $k_{2a}$ and $k_{2b}$. An unobserved
number of tags, $k_3$, may exist.

process in which, after the Rth reader session (arbitration
protocol run), we calculate the probability of having a tag
remaining and, if this probability is above a threshold value,
we carry out the (R + 1)th reader session.

For the general case of R > 2 reader sessions, we propose
two classes of estimators. One class of estimators emerges
from the generalization of Eqn. (1). The other class of
estimators is obtained by extending a classical capture-recapture
result by Schnabel [14] in order to be able to estimate the
error probability p. These estimator classes are the major
contributions of this paper, along with the overall idea of a
sequential decision process for dealing with the missing tag
problem.

III. System Model 

The system considered consists of multiple readers and tags. 
Each reader can have multiple reader sessions, defined as a 
session in which the reader runs its arbitration protocol, trying 
to resolve the entire tag set. The outcome of a reader session 
is a set specifying the tags resolved by a reader in a reader 
session. The sets are assumed to contain no errors, meaning 
that if a tag successfully backscatters a signal to the reader 
without collision, then the tag is present in a set and the tag is 
resolved. The reader sessions are assumed to be coordinated
in a way that the readers do not interfere with each other, i.e.
the reader collision problem [15] does not occur.

Throughout the paper we assume independent reader
sessions, except in Section VII-A, where we introduce the
correlated model for evaluation of dependent reader sessions. In a
given session, each tag will, with probability p, be in a blind
spot, i.e. unable to communicate with the reader. The
complete tag set S contains N tags and remains unchanged 
through the reader sessions. The probability of error (blind 
spot) p is identical for each tag in each reader session. That 
is, for a given reader session and a given tag, the tag is made 
unreadable with probability p, independently of the other tags 
and previous reader sessions. 

We assume independence across the tags: whether
a tag $T_m$ is readable does not depend on whether
another tag is readable. On the other hand, we introduce
correlation by defining conditional probabilities that the tag
$T_m$ is readable in reader session $r_{i+1}$ provided that the same
tag $T_m$ was readable/not readable in $r_i$. The conditional
probabilities in the correlated case are selected such that the
expected number of non-readable tags in the reader session
$r_{i+1}$ remains Np. This is physically plausible, as we should
not be able to improve the overall readability of the tag set
through a random physical displacement.

IV. Proposed Solution 

Four random variables, $K_1$, $K_{2a}$, $K_{2b}$ and $K_3$, follow the
multinomial distribution and describe the number of tags in
the sets $k_1$, $k_{2a}$, $k_{2b}$, and $k_3$, respectively, see Fig. 1. The
probability of a tag being read in the first reader session is
$(1-p)$, and in both reader sessions $(1-p)^2$. The probability
of a tag being read in the first but not in the second (and vice
versa) is $(1-p)p$, and of not being read at all $p^2$. This gives the
probability mass function (pmf)

$$\Pr[K_1=k_1, K_{2a}=k_{2a}, K_{2b}=k_{2b}, K_3=k_3] = \binom{N}{k_1, k_{2a}, k_{2b}, k_3} (1-p)^{2k_1} \big((1-p)p\big)^{k_{2a}+k_{2b}} p^{2k_3}.$$

Let us define one more random variable, $K_2$, being the sum of
$K_{2a}$ and $K_{2b}$; then, assuming that they are independent, the
pmf becomes

$$\Pr[K_1=k_1, K_2=k_2, K_3=k_3] = \binom{N}{k_1, k_2, k_3} (1-p)^{2k_1} \big(2(1-p)p\big)^{k_2} p^{2k_3}.$$

The expected values of the random variables are

$$\begin{aligned}
E[K_1] &= N(1-p)^2 \\
E[K_2] &= E[K_{2a}] + E[K_{2b}] = 2N(1-p)p \\
E[K_3] &= Np^2
\end{aligned}$$

When measured values of $k_1$ and $k_2$ are found, and by
assuming that they are close to their respective expected values,
we assume the following approximation

$$\begin{aligned}
k_1 &= \hat{N}(1-\hat{p})^2 \approx E[K_1] \\
k_2 &= 2\hat{N}(1-\hat{p})\hat{p} \approx E[K_2] \\
k_3 &= \hat{N}\hat{p}^2 \approx E[K_3]
\end{aligned} \qquad (2)$$



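Based on this, an estimate of p can be found by taking a ratio
based on the set relationship, namely

$$\frac{k_1}{k_2} = \frac{\hat{N}(1-\hat{p})^2}{2\hat{N}(1-\hat{p})\hat{p}} \quad\Leftrightarrow\quad \hat{p} = \frac{k_2}{2k_1 + k_2}. \qquad (3)$$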

Note that the (unknown) tag set cardinality N is cancelled out.
Using this estimator, an estimate of N can be found, based on
the fact that $E[K_3] = Np^2 \approx k_3 = \hat{N}\hat{p}^2$ and $N = k_1 + k_2 + k_3$.
This yields an estimate of N for two reader sessions

$$k_1 + k_2 = \hat{N} - k_3 = \hat{N}(1-\hat{p}^2) \quad\Leftrightarrow\quad \hat{N} = \frac{k_1 + k_2}{1-\hat{p}^2}, \qquad (4)$$

where $\hat{p}$ is given by $k_1$ and $k_2$ in Eqn. (3).

When estimates of p and N have been obtained, the
probability of missing one or more tags can be calculated. As
the probability of missing one tag in one reader session is p,
the probability of not missing any of the N tags in two reader sessions
is $(1-p^2)^N$. This gives the estimate of the probability of
missing at least one tag as

$$\hat{p}_M = 1 - (1-\hat{p}^2)^{\hat{N}}.$$

If this probability is large, it is likely that tags are left unread. 
It is possible to improve the estimates by making more than 
two reader sessions. This is described in the following section. 
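
The two-session estimators of this section can be sketched in a few lines of Python. This is only an illustration under the independence assumption; the function name `two_session_estimates` and the error handling for degenerate inputs are hypothetical choices, not part of the paper.

```python
import math


def two_session_estimates(k1, k2):
    """Return (p_hat, n_hat, p_m_hat) for two reader sessions.

    k1: number of tags read in both sessions.
    k2: number of tags read in exactly one of the two sessions.
    """
    if k1 + k2 == 0:
        raise ValueError("no tags were read in either session")
    # Eqn. (3): the unknown cardinality N cancels out of the ratio k1/k2.
    p_hat = k2 / (2 * k1 + k2)
    if p_hat >= 1.0:
        # Assumed handling: with k1 == 0 the cardinality estimate is undefined.
        raise ValueError("no tag was read in both sessions; N cannot be estimated")
    # Eqn. (4): observed tags divided by the estimated fraction of resolved tags.
    n_hat = (k1 + k2) / (1.0 - p_hat ** 2)
    # p_M = 1 - (1 - p^2)^N, evaluated with log1p/expm1 for numerical stability.
    p_m_hat = -math.expm1(n_hat * math.log1p(-p_hat ** 2))
    return p_hat, n_hat, p_m_hat


p_hat, n_hat, p_m_hat = two_session_estimates(k1=320, k2=160)
print(p_hat, n_hat, p_m_hat)
```

With 320 tags seen in both sessions and 160 seen in only one, the estimates are an error probability of 0.2 and a cardinality of 500, and the estimated probability of a missing tag is close to one, so further sessions would be needed.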

V. Generalization to Multiple Reader Sessions 

To provide better estimates, the two-reader session case is
extended to support more independent reader sessions. The
observable sets $k_1$ and $k_2$ are extended by defining a vector,
$k = [k_1, \ldots, k_R]^T$, which holds information about how many
tags were found in how many reader sessions. The first entry
specifies the number of tags found in R reader sessions, the
second the number of tags found in R - 1 reader sessions
(regardless of which reader sessions) and so on. Each element
in k is defined by extending Eqn. (2) to:

$$k_i \approx \hat{N}\binom{R}{i-1}(1-\hat{p})^{R-i+1}\hat{p}^{\,i-1}, \qquad (5)$$

where $i \in \{1, 2, \ldots, R\}$. However, when extending to more
than two reader sessions, there are more relationships between
the sets. In the two-reader session case the measurable sets are
$k_1$ and $k_2$, and the ratio $k_1/k_2$ is used (see Eqn. (3)), but others
exist, namely $k_1/(k_1 + k_2)$ and $k_2/(k_1 + k_2)$, which lead to
the same estimator for p. When the number of reader sessions
increases, the number of measurable sets and the number
of possible ratios increase, e.g. for three reader sessions,
the sets are $k_1$, $k_2$ and $k_3$, and possible ratios are $k_1/k_2$,
$k_1/k_3$, $k_2/k_3$, $k_1/(k_2 + k_3)$, $k_2/(k_1 + k_2)$, etc. Therefore we
do not have one good ratio with equally weighted sets, and
common to almost all of the ratios is that an explicit expression
for $\hat{p}$ does not exist, so $\hat{p}$ needs to be calculated
numerically. Before explaining some of the possible estimators
of p, the estimator of N and the method of calculating the
probability of missing one or more tags are explained.
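
As an illustration of how the vector k could be built in practice, the sketch below counts, for each tag, the number of sessions in which it was read. It assumes each reader session is available as a Python set of tag identifiers; the function name `count_vector` is hypothetical.

```python
from collections import Counter


def count_vector(sessions):
    """Build k = [k_1, ..., k_R] from R reader sessions.

    sessions: list of R sets of tag identifiers (one set per session).
    Entry k_i (1-based) is the number of tags read in exactly R - i + 1 sessions,
    so k_1 counts the tags read in all R sessions.
    """
    num_sessions = len(sessions)
    hits = Counter()
    for tags in sessions:
        hits.update(tags)  # each session is a set, so a tag counts once per session
    k = [0] * num_sessions
    for count in hits.values():
        k[num_sessions - count] += 1
    return k


sessions = [{1, 2, 3}, {2, 3, 4}, {3, 5}]
print(count_vector(sessions))
```

Tags that are never read do not appear in any session set and therefore do not contribute to k, which matches the fact that the unobserved set cannot be measured.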

The estimator of N for R > 2 reader sessions is based on 
the estimator of the tag set cardinality from 18|. The percentage 
of resolved tags is $(1-p^R)$, and the number of resolved
tags is the sum of the $k_i$'s; therefore the tag set cardinality estimator can
be generalized to

$$\hat{N} = \frac{\sum_{i=1}^{R} k_i}{1-\hat{p}^R}.$$



The estimate of the probability of missing at least one tag
is extended to:

$$\hat{p}_M = 1 - (1-\hat{p}^R)^{\hat{N}}. \qquad (6)$$



This estimator is useful if an application requires that the
probability of one or more missing tags be lower than
some threshold, $t_l$ (e.g. $t_l = 10^{-5}$). If $\hat{p}_M > t_l$, another reader
session is required. A new $\hat{p}_M$ is estimated after each reader
session until the threshold set by the application is satisfied.

As this estimated probability, $\hat{p}_M$, is based on estimates of p
and N, it relies on these being "good". Therefore it can be
necessary to either 1) add an artificial bias to $\hat{p}_M$, or 2) perform
an extra reading after the criterion is satisfied. This is because
$\hat{p}_M$ could be lower than $t_l$ in some cases where it should not be,
as p could be underestimated.
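
The stopping rule can be sketched as follows. The code evaluates Eqn. (6) for a given error probability and cardinality and returns the first number of sessions at which the probability of a missing tag no longer exceeds the threshold. The function names, the `max_sessions` guard and the threshold value of 1e-5 are illustrative assumptions; in a real system p and N would be replaced by their running estimates after each session.

```python
import math


def prob_missing(p, n, num_sessions):
    """Eqn. (6): probability that at least one of n tags is missed in all sessions.

    Requires 0 <= p < 1. log1p/expm1 keep precision when p**num_sessions is tiny.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return -math.expm1(n * math.log1p(-(p ** num_sessions)))


def sessions_needed(p, n, threshold, max_sessions=1000):
    """Smallest number of sessions R such that prob_missing(p, n, R) <= threshold."""
    for num_sessions in range(1, max_sessions + 1):
        if prob_missing(p, n, num_sessions) <= threshold:
            return num_sessions
    raise RuntimeError("threshold not reached within max_sessions")


for p in (0.1, 0.2):
    print(p, sessions_needed(p, n=500, threshold=1e-5))
```

An extra margin, as suggested above, could be added by simply returning `num_sessions + 1` or by lowering the threshold passed to `sessions_needed`.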

While the estimates of N and $p_M$ are straightforward
to compute given $\hat{p}$, $\hat{p}$ itself is not easy to compute directly,
because the estimate is found based on a ratio of sums of
elements in k, and the performance of the estimator depends
on choosing a good ratio.

VI. Error Probability Estimators

An estimator of the error probability is defined by which
elements from k are included in the ratio's numerator and
denominator, respectively. Two window functions, $\phi_n(i)$ and
$\phi_d(i)$, are used to describe which elements are included. The
ratio is then defined as:

$$\frac{\sum_{i=1}^{R}\phi_n(i)\,k_i}{\sum_{i=1}^{R}\phi_d(i)\,k_i} = \frac{\sum_{i=1}^{R}\phi_n(i)\binom{R}{i-1}(1-\hat{p})^{R-i+1}\hat{p}^{\,i-1}}{\sum_{i=1}^{R}\phi_d(i)\binom{R}{i-1}(1-\hat{p})^{R-i+1}\hat{p}^{\,i-1}}, \qquad (7)$$

where Eqn. (5) is inserted, cancelling out $\hat{N}$. An example is

$$\phi_n(i) = 1, \qquad \phi_d(i) = \begin{cases} 1 & \text{if } i \in \{2,\ldots,R\}, \\ 0 & \text{otherwise,} \end{cases}$$

which for two reader sessions results in the ratio $(k_1 + k_2)/k_2$.
The estimators of p proposed here are defined by their
window functions. As the number of reader sessions increases,
it becomes more likely that elements in k become zero. These
elements do not provide any information, and are therefore
excluded by setting $\phi_n(i) = \phi_d(i) = 0$ when $k_i = 0$. This
is used in the numerator window function for both proposed
estimators of p:

$$\phi_n(i) = \begin{cases} 1 & \text{if } k_i \neq 0, \\ 0 & \text{otherwise.} \end{cases}$$

The difference between the estimators is then the denominator
window function.

Estimator 1: Remove Maximum Element 

The first estimator of p is based on the simple principle of
removing the largest entry of k and all zero elements in the
denominator. This gives the window function:

$$\phi_d(i) = \begin{cases} 1 & \text{if } k_i \neq 0 \text{ and } k_i \neq \max k, \\ 0 & \text{otherwise.} \end{cases}$$

This estimator is called the Remove Maximum Element
(RME) Estimator.



Estimator 2: Remove Elements Greater than the Mean 

For the second estimator of p an averaged version of k is
used. As shown in Fig. 1, the two subsets $k_{2a}$ and $k_{2b}$
are added together into $k_2$. Instead of using this sum, a new
vector, $k'$, is defined, containing estimates of these subsets.
The estimate of a subset is the average of the corresponding entry in k
with regard to the number of subsets per entry in k. This is
defined as:

$$k'_i = \frac{k_i}{\binom{R}{i-1}}.$$

The second estimator of p is named the Remove Elements
Greater than the Mean (REGM) Estimator. The denominator
window function is

$$\phi_d(i) = \begin{cases} 1 & \text{if } k'_i \neq 0 \text{ and } k'_i < m_{k'}, \\ 0 & \text{otherwise,} \end{cases}$$

where $m_{k'}$ is the sample mean of the nonzero elements in $k'$.
This estimator removes all zero elements and all elements
greater than $m_{k'}$.
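
A possible numerical realization of the REGM Estimator is sketched below. Two points are assumptions rather than statements from the paper: the ratio itself is formed from the raw entries of k (with k' used only to select the denominator window), and the error probability is found by a plain grid search over (0, 1), which stands in for whatever numerical solver is actually used. The function name `regm_estimate` is hypothetical.

```python
from math import comb


def regm_estimate(k, grid=10000):
    """Estimate p from k = [k_1, ..., k_R] with the REGM window functions.

    k[i] (0-based) is the number of tags read in exactly R - i sessions.
    """
    num_sessions = len(k)
    # Number of subsets behind each entry of k, used to form the averaged vector k'.
    subsets = [comb(num_sessions, i) for i in range(num_sessions)]
    k_avg = [ki / s for ki, s in zip(k, subsets)]
    nonzero = [v for v in k_avg if v > 0]
    if not nonzero:
        raise ValueError("all entries of k are zero")
    mean = sum(nonzero) / len(nonzero)
    numerator_window = [ki > 0 for ki in k]
    denominator_window = [0 < v < mean for v in k_avg]
    denominator = sum(ki for ki, keep in zip(k, denominator_window) if keep)
    if denominator == 0:
        raise ValueError("denominator window is empty")
    observed = sum(ki for ki, keep in zip(k, numerator_window) if keep) / denominator

    def model_ratio(p):
        # Right-hand side of the ratio with the binomial terms of each entry of k.
        terms = [comb(num_sessions, i) * (1 - p) ** (num_sessions - i) * p ** i
                 for i in range(num_sessions)]
        num = sum(t for t, keep in zip(terms, numerator_window) if keep)
        den = sum(t for t, keep in zip(terms, denominator_window) if keep)
        return num / den

    # Grid search (assumed solver): pick the p whose model ratio is closest.
    candidates = (j / grid for j in range(1, grid))
    return min(candidates, key=lambda p: abs(model_ratio(p) - observed))


print(regm_estimate([512, 384, 96]))
```

The example vector corresponds to the expected counts for 1000 tags, an error probability of 0.2 and three reader sessions, so the returned estimate should be close to 0.2.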

The Schnabel Estimator 

We propose to use the simple capture-recapture model,
which provides an estimate of N. When the reader sessions
are assumed to be independent, and as the tags are assumed to
be in a closed population, the tag cardinality estimation can be
treated as a simple capture-recapture experiment. When
the number of reader sessions, R, is two, the Lincoln-Petersen
method provides a maximum likelihood estimate [16], where
the tag set cardinality is found as

$$\hat{N} = \frac{n_1 n_2}{m_2},$$

where $n_1$ is the number of tags found in the first reader session,
$n_2$ is the number of tags found in the second, and $m_2$ is the
number of re-found tags in the second reader session. For more
than two reader sessions, the Schnabel method from [14] can
be used, which is a weighted average over a series of
Lincoln-Petersen estimates

$$\hat{N}_S = \frac{\sum_{i=1}^{R} n_i M_i}{\sum_{i=1}^{R} m_i}, \qquad (8)$$

where $M_i$ is the total number of distinct tags found in the first $(i-1)$
reader sessions. Note that the two equations are equal for R =
2, as $M_1 = m_1 = 0$ and $M_2 = n_1$.

The method does not make an intermediate estimate of p,
but finds an estimate of N directly. To compare the methods and to
make an estimate of the probability $p_M$, an estimator for
p, $\hat{p}_S$, is derived. An estimate of the probability of success
for the ith reader session is $n_i/\hat{N}_S$, and the estimator is found by
averaging over the errors,

$$\hat{p}_S = \frac{1}{R}\sum_{i=1}^{R}\left(1 - \frac{n_i}{\hat{N}_S}\right),$$

which is the sample mean of the error probabilities found
in all reader sessions. This is used for comparison and for
calculation of $\hat{p}_M$ as with the other estimators.
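
The Schnabel-based estimates could be computed as in the following sketch, which assumes that each reader session is given as a set of tag identifiers. The per-session success probability is taken to be the session size divided by the cardinality estimate, which is an interpretation of the text above, and the function name `schnabel_estimates` is hypothetical.

```python
def schnabel_estimates(sessions):
    """Return (n_hat, p_hat) from a list of reader sessions (sets of tag IDs).

    n_hat follows the Schnabel form sum(n_i * M_i) / sum(m_i), where n_i is the
    size of session i, m_i the number of tags in session i that were seen before,
    and M_i the number of distinct tags seen before session i.
    """
    seen = set()
    weighted_sum = 0   # sum of n_i * M_i
    recaptured = 0     # sum of m_i
    sizes = []
    for tags in sessions:
        weighted_sum += len(tags) * len(seen)
        recaptured += len(tags & seen)
        sizes.append(len(tags))
        seen |= tags
    if recaptured == 0:
        raise ValueError("no tag was read more than once; estimate undefined")
    n_hat = weighted_sum / recaptured
    # Assumed reading of the text: average of the per-session error estimates.
    p_hat = sum(1.0 - n_i / n_hat for n_i in sizes) / len(sizes)
    return n_hat, p_hat


print(schnabel_estimates([{1, 2, 3, 4}, {3, 4, 5, 6}]))
```

For two sessions the result reduces to the Lincoln-Petersen estimate: here both sessions contain four tags, two of which are shared, giving a cardinality estimate of 8 and an error probability estimate of 0.5. Note that this sketch does not clamp the error estimate, so a session larger than the cardinality estimate would contribute a negative term.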



VII. Analytical Evaluation

The analytical work is made for two reader sessions, as then
an explicit estimate of p can be found. The estimator for p is
a function of the observations $k_1$ and $k_2$, denoted $g(k_1,k_2)$:

$$g(k_1,k_2) = \hat{p} = \begin{cases} 1 & \text{if } k_1 = 0 \text{ and } k_2 = 0, \\ 0 & \text{if } k_1 > 0 \text{ and } k_2 = 0, \\ \dfrac{k_2}{2k_1+k_2} & \text{otherwise,} \end{cases} \qquad (9)$$

which follows from Eqn. (3), but with two special cases where
either no tags are found or all tags are found in both reader
sessions. Its expected value is given in the following Lemma.

Lemma 1 Let the estimate of p be defined as in Eqn. (9), then
the expected value of $\hat{p}$ for known N and p is

$$E[g(K_1,K_2) \mid N,p] = \frac{2N(p - p^{2N})}{2N-1} + p^{2N}.$$

Proof: See Appendix for the proof. ■
The above result shows that the estimator is biased, but as
N increases and p decreases, the bias can be neglected.
The bias can in principle be removed, as it arises due to the 
definition of the estimator in the marginal cases. Appropriate 
choices of the marginal cases can make it unbiased. 

If the expected error is allowed to be at most, e.g., 0.01 and $p \le 0.9$,
the lower limit for N follows from

$$E[g(K_1,K_2) \mid N \ge 46, p \le 0.9] - p < 0.01,$$

that is, if the maximum assumed error probability is p = 0.9,
then the minimum number of tags should be N = 46 to satisfy
the error requirement.
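
Lemma 1 and the resulting lower limit on N can be checked numerically. The sketch below compares the closed-form expression with an exact enumeration over the multinomial distribution and then searches for the smallest N that keeps the bias below a given bound. It assumes that the estimator takes the value 1 when no tags are read and 0 when every read tag appears in both sessions; all function names are hypothetical.

```python
from math import comb


def g(k1, k2):
    """Two-session estimator of p, including the two special cases."""
    if k2 == 0:
        return 1.0 if k1 == 0 else 0.0
    return k2 / (2 * k1 + k2)


def expected_g_exact(n, p):
    """E[g(K1, K2)] by direct summation over the multinomial distribution."""
    both, one, none = (1 - p) ** 2, 2 * (1 - p) * p, p ** 2
    total = 0.0
    for k1 in range(n + 1):
        for k2 in range(n - k1 + 1):
            k3 = n - k1 - k2
            weight = comb(n, k1) * comb(n - k1, k2)
            total += g(k1, k2) * weight * both ** k1 * one ** k2 * none ** k3
    return total


def expected_g_closed_form(n, p):
    """Closed-form expected value of the estimator."""
    return 2 * n * (p - p ** (2 * n)) / (2 * n - 1) + p ** (2 * n)


def min_tags(p_max, max_bias):
    """Smallest N for which the bias at p_max is below max_bias."""
    n = 1
    while expected_g_closed_form(n, p_max) - p_max >= max_bias:
        n += 1
    return n


print(expected_g_exact(5, 0.3), expected_g_closed_form(5, 0.3))
print(min_tags(0.9, 0.01))
```

The enumeration is only practical for small N, but it is sufficient to confirm that the closed form and the definition of the estimator agree.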

The estimate of N is shown to be unbiased in the following.

Lemma 2 Let the estimate of the tag set cardinality be defined
as in Eqn. (4), then, for known N and p, the estimate of N is
unbiased, that is $E[\hat{N} \mid N,p] = N$.

Proof: From Eqn. (4) it follows that

$$E[\hat{N} \mid N,p] = \sum_{k_1,k_2} \frac{k_1+k_2}{1-p^2}\,\Pr[K_1=k_1, K_2=k_2],$$

and by inserting the multinomial distribution, and the
probabilities for each set:

$$E[\hat{N} \mid N,p] = \sum_{k_1=0}^{N}\sum_{k_2=0}^{N-k_1} \frac{k_1+k_2}{1-p^2}\binom{N}{k_1, k_2, N-k_1-k_2}(1-p)^{2k_1}\big(2(1-p)p\big)^{k_2}p^{2(N-k_1-k_2)}.$$

This can be split into two sums, and by the expectation of a
multinomial distribution:

$$E[\hat{N} \mid N,p] = \frac{1}{1-p^2}\big(E[K_1]+E[K_2]\big) = \frac{N}{1-p^2}\big((1-p)^2 + 2(1-p)p\big) = N.$$

This result ensures, given a good estimate of the error proba- 
bility, that the tag set estimator produces an unbiased tag set 
cardinality estimate. 

For the method to work, the tag sets found in each reader
session have to be independent, as in, e.g., the examples
in Section II. To investigate what happens if the reader sessions
are dependent, the estimators are tested in scenarios with 
dependent reader sessions. The following section explains how 
the dependency is modelled, using a correlation coefficient to 
specify the correlation between reader sessions. 

A. Model for Dependent Reader Sessions 

So far it has been assumed that the reader sessions are
independent, but what if this does not hold? In the following,
a method is introduced to define the correlation for a tag
between the reader sessions $r_i$ and $r_{i+1}$. For two reader
sessions, define the Bernoulli random variable $X_1$ signifying
the outcome of one tag in the first reader session, and $X_2$ the
outcome in the second reader session, then

$$X_1 = \begin{cases} 1 & \text{w.p. } p, \\ 0 & \text{w.p. } 1-p, \end{cases} \qquad X_2 = \begin{cases} 1 & \text{w.p. } pq + (1-p)r, \\ 0 & \text{otherwise,} \end{cases}$$

where p is the probability of a tag not being read in the first
reader session, q is the probability that it is not read in the
second reader session either, and r is the probability of a tag
not being read in the second reader session although it was read in the first. This gives the
relations:

$$\begin{aligned}
\Pr[X_1 = 1] &= p, & \Pr[X_1 = 0] &= 1-p, \\
\Pr[X_2 = 1 \mid X_1 = 1] &= q, & \Pr[X_2 = 0 \mid X_1 = 1] &= 1-q, \\
\Pr[X_2 = 1 \mid X_1 = 0] &= r, & \Pr[X_2 = 0 \mid X_1 = 0] &= 1-r.
\end{aligned}$$

It is assumed that the expected error probability remains the
same between reader sessions, because of the random physical
displacement of the tags. Therefore $E[X_1] = E[X_2]$, and

$$pq + (1-p)r = p,$$

where r and q form the bound $r \le p \le q$ because an error
in the first reader session increases the probability of error in
the second.

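To specify the level of correlation, the correlation coefficient
is used, that is

$$\rho = \frac{\mathrm{Cov}(X_1,X_2)}{\sigma_{X_1}\sigma_{X_2}} = \frac{q-p}{1-p},$$

where $0 \le \rho \le 1$. This yields the correlated probabilities q
and r with respect to p and $\rho$ as

$$q = \rho(1-p) + p, \qquad r = \frac{p(1-q)}{1-p}. \qquad (10)$$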

This is used to show how the presented approach to solve the 
missing tag problem is affected if the reader sessions are not 
independent. The results are shown in the following section. 
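
The correlation model lends itself to a short simulation sketch. The code below computes q and r from p and the correlation coefficient and then draws, for every tag, a chain of readable/unreadable outcomes in which each session depends only on the previous one. Applying the same pair (q, r) between every pair of consecutive sessions, and the function names used, are assumptions of this illustration.

```python
import random


def conditional_error_probs(p, rho):
    """Return (q, r): error probabilities given an error / no error in the previous session."""
    q = rho * (1 - p) + p
    r = p * (1 - q) / (1 - p)
    return q, r


def simulate_unreadable(num_tags, num_sessions, p, rho, rng):
    """Return one list per session with True where a tag is unreadable."""
    q, r = conditional_error_probs(p, rho)
    current = [rng.random() < p for _ in range(num_tags)]
    history = [current]
    for _ in range(num_sessions - 1):
        # The outcome depends only on the same tag's outcome in the previous session.
        current = [rng.random() < (q if was_unreadable else r)
                   for was_unreadable in current]
        history.append(current)
    return history


q, r = conditional_error_probs(p=0.2, rho=0.3)
print(q, r, 0.2 * q + 0.8 * r)  # the last value is the marginal error probability

history = simulate_unreadable(500, 4, p=0.2, rho=0.3, rng=random.Random(1))
print([sum(session) for session in history])
```

For a correlation coefficient of zero the two conditional probabilities both equal p, which recovers the independent model used in the rest of the paper.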



VIII. Simulation Evaluation 

To evaluate the estimators against each other, and to verify
that they perform as expected, simulations have been carried
out. The true number of tags is set to N = 500 and each result
is averaged over 1000 experiments.

A. Independent Reader Sessions 

Fig. 2. Simulated estimation of p vs. number of reader sessions for p = 0.2.

The results of the estimate of p are shown in Fig. 2. It
shows that the RME Estimator does not perform as well
as the others. This is because the maximum element that
is removed may contain almost all the tags and thereby all
the information. By removing it, the estimator makes a bad
estimate. The problem decreases as the number of reader
sessions increases, as the tags are spread out over more sets.
Because of the fluctuations of the RME Estimator in its
estimate of p, it is not considered further and is not included
in any of the following figures.

Fig. 3. Simulated estimation of N vs. number of reader sessions for p = 0.2.

Fig. 4. Simulated MSE of $\hat{N}$ vs. number of reader sessions for p = 0.2.

The tag set cardinality is estimated in Fig. 3. The estimates
given by the two estimators are similar, but the REGM
Estimator converges faster to the true number of tags. This can
be seen in Fig. 4, where the mean-square error of $\hat{N}$ is given,
showing that the Schnabel Estimator converges to zero more
slowly than the REGM Estimator.

Fig. 5. Simulated estimation of $p_M$ vs. number of reader sessions for p = 0.1
and p = 0.2. An example threshold is at $10^{-5}$.

The estimate of $p_M$ is the most important estimate, as it
shows how many reader sessions are needed to be certain,
with high probability, that all tags are resolved. Results are shown
in Fig. 5 for p = 0.1 and p = 0.2. It can be seen that both
estimators are close to the true $p_M$ calculated using Eqn. (6)
with the true p and N, as if they were known a priori. Therefore,
if the error probability is p = 0.1, then the sequential decision
process determines to stop after R = 8 reader sessions, and
for p = 0.2 after R = 12, if the allowed threshold is $10^{-5}$. The
p = 0.2 case can be compared with Fig. 3, where it is seen
that all tags are found in approximately 8 reader sessions.



B. Dependent Reader Sessions 

In the following the estimators are tested in scenarios
where the independence assumption does not hold. For the
simulations, $\rho = 0.1$ and $\rho = 0.3$ are used
to demonstrate the effect of correlated reader sessions. The
correlated error probabilities are found using Eqn. (10), in
which the correlation coefficient $\rho$ is a parameter.

Fig. 6. Simulated estimation of p vs. number of reader sessions with
correlation between the reader sessions, p = 0.2 and the correlation coefficient
is $\rho = 0.1$ and $\rho = 0.3$.

The estimated error probabilities are shown in Fig. 6,
where it can be seen that the estimators are affected by the
correlated reader sessions. The Schnabel Estimator converges
to the correct error probability, whereas the REGM Estimator
converges to some other error probability, depending on the
correlation.

Fig. 7. Simulated estimation of N vs. number of reader sessions with
correlation between the reader sessions, p = 0.2 and the correlation coefficient
is $\rho = 0.1$ for the upper values, and $\rho = 0.3$ for the lower values.

Even though the error probability estimates of the REGM
Estimator converge to wrong values of p, it performs better
than the Schnabel Estimator when estimating the tag set
cardinality. This is seen in Fig. 7, where the REGM Estimator
never provides an estimate lower than the actual number of
resolved tags. Both estimators converge more slowly towards the
true N because of the correlation between the reader sessions.

Fig. 8. Simulated estimation of $p_M$ vs. number of reader sessions with
correlation between the reader sessions, p = 0.1 and p = 0.2, and the
correlation coefficient is $\rho = 0.3$. An example threshold is at $10^{-5}$.

The estimate of the probability of completely missing one
tag is shown in Fig. 8, where it can be seen that the correlation
affects the performance of the estimate. Therefore, if the
estimator is used as is, the estimate is wrong. The
solution to this, proposed in Section V, is to add some
estimation margin, e.g. two additional reader sessions, so that
more reader sessions than strictly necessary are used, to be
certain that the probability of missing one or more tags is below
the chosen threshold.

IX. Conclusion 

In this paper two different methods for obtaining the error 
probability estimate and the tag set cardinality estimate are 
proposed. The first method, named the REGM Estimator, is 
based on the assumption that it is possible to obtain statistically 
independent, uncorrelated reader sessions. First this estimator 
is introduced and explained with two reader sessions, after 
which it is extended to the general case. Then a method is 
devised to calculate the number of required reader sessions 
to guarantee, with some probability, that no tags are missing. 
The second estimator is based on the Schnabel method, known 
from capture-recapture literature, which is extended to also 
provide estimates of the error probability and the probability 
that tags are missing. 

It is shown that the REGM Estimator for the error probabil- 
ity for two reader sessions is biased, but that the bias becomes 
insignificant when the number of tags increases and the error 
probability decreases. Also, it is shown that the estimate of 
the tag set cardinality is unbiased in the case of two reader 
sessions. For the estimators to work it is important that the
assumption of independent, uncorrelated reader sessions holds.
To show how the estimators behave when the reader sessions
are correlated, a model is devised for use in the simulations.

Simulations are performed, which show that the tag set 
cardinality estimator using the estimated error probability 
from the REGM Estimator converges towards the correct 
value faster than the Schnabel Estimator. They also show that
more reader sessions decrease the probability of a missing
tag, indicating that the proposed method for estimating the
probability of missing a tag is working. Experiments with
dependent reader sessions show that the estimation of the tag
set cardinality requires more reader sessions to be precise, but
that the REGM Estimator's estimate of the tag set cardinality
still converges faster than the one based on the Schnabel
method. However, both estimators underestimate the
probability of missing one or more tags, resulting in a possibility of
premature termination of the algorithm. To counter this, some
estimation margin should be used when the reader sessions are
dependent, and the analysis of this margin will be investigated
in further work. Another interesting avenue for future work is to
investigate the cases when the reading errors have correlations
across the tags in the same reader session. Future work
should also include evaluation of the proposed methods using
more detailed physical models for the tag reading errors.

Appendix

The following is the proof of Lemma 1.

Proof: The estimator is defined as in Eqn. (9) and the
expected value $E[g(K_1,K_2) \mid N,p]$ is

$$\begin{aligned}
E[g(K_1,K_2)] &= \sum_{k_1,k_2} g(k_1,k_2)\Pr[K_1=k_1, K_2=k_2] \\
&= \sum_{k_1=0}^{N}\sum_{k_2=1}^{N-k_1} \frac{k_2}{2k_1+k_2}\Pr[K_1=k_1, K_2=k_2] + \Pr[K_1=0, K_2=0].
\end{aligned}$$

We insert the multinomial distribution with the probabilities
for each set:

$$\begin{aligned}
E[g(K_1,K_2)] &= \sum_{k_1=0}^{N}\sum_{k_2=1}^{N-k_1} \frac{k_2}{2k_1+k_2}\binom{N}{k_1, k_2, N-k_1-k_2}(1-p)^{2k_1}\big(2(1-p)p\big)^{k_2}p^{2(N-k_1-k_2)} + p^{2N} \\
&= p^{2N}\sum_{k_1=0}^{N}\sum_{k_2=1}^{N-k_1} \frac{k_2}{2k_1+k_2}\binom{N}{k_1, k_2, N-k_1-k_2}\,2^{k_2}\left(\frac{1-p}{p}\right)^{2k_1+k_2} + p^{2N}.
\end{aligned}$$

We define a function h, which is all but the two factors $p^{2N}$, and we
differentiate it with respect to p:

$$\begin{aligned}
\frac{dh}{dp} &= \sum_{k_1=0}^{N}\sum_{k_2=1}^{N-k_1} k_2\binom{N}{k_1, k_2, N-k_1-k_2}\,2^{k_2}\left(\frac{1-p}{p}\right)^{2k_1+k_2-1}\left(\frac{-1}{p^2}\right) \\
&= \frac{-1}{(1-p)p^{2N+1}}\sum_{k_1=0}^{N}\sum_{k_2=0}^{N-k_1} k_2\binom{N}{k_1, k_2, N-k_1-k_2}(1-p)^{2k_1}\big(2(1-p)p\big)^{k_2}p^{2(N-k_1-k_2)} \\
&= \frac{-E[K_2]}{(1-p)p^{2N+1}} = \frac{-2N(1-p)p}{(1-p)p^{2N+1}} = -\frac{2N}{p^{2N}}.
\end{aligned}$$

This function is integrated and merged with the part not
differentiated, which gives

$$h = \int -\frac{2N}{p^{2N}}\,dp = \frac{2Np^{1-2N}}{2N-1} + c,$$

$$E[g(K_1,K_2)] = p^{2N}\left(\frac{2Np^{1-2N}}{2N-1} + c\right) + p^{2N}.$$

By inserting a known value of p, namely p = 1, for which h = 0, and
solving with respect to the integration constant c, c is found to be
$-\frac{2N}{2N-1}$, and the expected value is

$$E[g(K_1,K_2)] = \frac{2N(p - p^{2N})}{2N-1} + p^{2N},$$

which is approximately p for large N. ■



